Weakly supervised coupled networks for visual sentiment analysis
Automatic assessment of sentiment from visual content has gained considerable attention with the increasing tendency to express opinions online. In this paper, we address visual sentiment analysis by exploiting high-level abstraction in the recognition process. Existing methods based on convolutional neural networks learn sentiment representations from the holistic image appearance. However, different image regions can have different influences on the intended expression. This paper presents a weakly supervised coupled convolutional network with two branches to leverage the localized information. The first branch detects a sentiment-specific soft map by training a fully convolutional network with a cross-spatial pooling strategy, which only requires image-level labels and thereby significantly reduces the annotation burden. The second branch utilizes both the holistic and localized information by coupling the sentiment map with deep features for robust classification. We integrate the sentiment detection and classification branches into a unified deep framework and optimize the network in an end-to-end manner. Extensive experiments on six benchmark datasets demonstrate that the proposed method performs favorably against the state-of-the-art methods for visual sentiment analysis.
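As a rough illustration of the two-branch idea described above, the sketch below (PyTorch, not the authors' released code) pairs a fully convolutional detection branch, trained from image-level labels through a top-k spatial pooling step standing in for the cross-spatial pooling strategy, with a classification branch that couples the resulting soft sentiment map with holistic deep features; the backbone, layer sizes, and pooling choice are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class CoupledSentimentNet(nn.Module):
    """Two-branch sketch: (1) a fully convolutional detection branch producing a
    sentiment map from image-level labels only, and (2) a classification branch
    coupling that map with holistic deep features."""
    def __init__(self, num_classes=2):
        super().__init__()
        backbone = models.resnet50(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # (B, 2048, H, W) feature maps
        self.class_maps = nn.Conv2d(2048, num_classes, kernel_size=1)   # one response map per class
        self.classifier = nn.Linear(2048 * 2, num_classes)              # localized + holistic features

    def cross_spatial_pool(self, maps):
        # Assumed pooling: average the top-k responses per class map so the
        # image-level score is driven by the most sentiment-relevant regions.
        b, c, h, w = maps.shape
        flat = maps.view(b, c, h * w)
        k = max(1, (h * w) // 4)
        topk, _ = flat.topk(k, dim=2)
        return topk.mean(dim=2)                        # (B, num_classes) image-level scores

    def forward(self, x):
        feat = self.features(x)                        # (B, 2048, H, W)
        maps = self.class_maps(feat)                   # (B, num_classes, H, W)
        det_scores = self.cross_spatial_pool(maps)     # supervised with image-level labels only
        soft_map = torch.sigmoid(maps.sum(dim=1, keepdim=True))   # soft sentiment map
        weighted = (feat * soft_map).mean(dim=(2, 3))  # localized features
        holistic = feat.mean(dim=(2, 3))               # holistic features
        cls_scores = self.classifier(torch.cat([weighted, holistic], dim=1))
        return det_scores, cls_scores
```

Both heads can then be trained jointly with cross-entropy losses on the same image-level sentiment labels, which is what keeps the localization branch weakly supervised.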
Clinical skin lesion diagnosis using representations inspired by dermatologist criteria
The skin is the largest organ in the human body. Around 30%-70% of individuals worldwide have skin-related health problems, for whom effective and efficient diagnosis is necessary. Recently, computer-aided diagnosis (CAD) systems have been successfully applied to the recognition of skin cancers in dermatoscopic images. However, little work has concentrated on the commonly encountered skin diseases in clinical images captured by easily accessible cameras or mobile phones. Meanwhile, for a CAD system, the representations of skin lesions need to be understandable to dermatologists so that the predictions are convincing. To address this problem, we present effective representations inspired by the accepted dermatological criteria for diagnosing clinical skin lesions. We demonstrate that the dermatological criteria are highly correlated with measurable visual components. Accordingly, we design six medical representations considering different criteria for the recognition of skin lesions, and construct a diagnosis system for clinical skin disease images. Experimental results show that the proposed medical representations not only capture the manifestations of skin lesions effectively and consistently with the dermatological criteria, but also improve the prediction performance with respect to state-of-the-art methods based on uninterpretable features.
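The abstract does not enumerate the six representations, but the general recipe of turning a dermatological criterion into a measurable visual component can be illustrated with a toy example. The sketch below (Python with OpenCV and NumPy) is entirely hypothetical and not one of the paper's representations; it scores lesion redness relative to the surrounding skin as a stand-in for an erythema-style criterion.

```python
import numpy as np
import cv2  # OpenCV, used here only for color-space conversion

def erythema_score(image_bgr, lesion_mask):
    """Toy measurable visual component: mean redness of the lesion relative to
    the surrounding skin, a plausible proxy for an erythema-like criterion."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    a_channel = lab[..., 1]                       # a*: green-red axis
    lesion = a_channel[lesion_mask > 0].mean()    # mean redness inside the lesion
    background = a_channel[lesion_mask == 0].mean()
    return float(lesion - background)             # larger -> redder lesion
```

Such hand-designed scores are interpretable by construction, which is the property the paper emphasizes when arguing that its representations remain convincing to dermatologists.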
Dynamic match kernel with deep convolutional features for image retrieval
For image retrieval methods based on the bag of visual words, much attention has been paid to enhancing the discriminative power of the local features. Although retrieved images are usually similar to a query in minutiae, they may be significantly different from a semantic perspective, which can be effectively distinguished by convolutional neural networks (CNNs). Such images should not be considered relevant pairs. To tackle this problem, we propose to construct a dynamic match kernel by adaptively calculating the matching thresholds between query and candidate images based on the pairwise distance among deep CNN features. In contrast to the typical static match kernel, which is independent of the global appearance of retrieved images, the dynamic one leverages semantic similarity as a constraint for determining the matches. Accordingly, we propose a semantic-constrained retrieval framework incorporating the dynamic match kernel, which focuses on matched patches between relevant images and filters out those from irrelevant pairs. Furthermore, we demonstrate that the proposed kernel complements recent methods, such as Hamming embedding, multiple assignment, local descriptor aggregation, and graph-based re-ranking, while it outperforms the static one under various settings on off-the-shelf evaluation metrics. We also propose to evaluate the matched patches both quantitatively and qualitatively. Extensive experiments on five benchmark data sets and large-scale distractors validate the merits of the proposed method against the state-of-the-art methods for image retrieval.
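The core of the dynamic match kernel, as described above, is that the local-descriptor matching threshold is no longer fixed but depends on how semantically close the query and candidate appear to a CNN. A minimal NumPy sketch of that coupling follows; the linear schedule, the constants, and the function names are assumptions, not the paper's exact formulation.

```python
import numpy as np

def dynamic_threshold(query_feat, cand_feat, base_threshold=24.0, alpha=8.0):
    """Adapt the Hamming-distance matching threshold to the semantic distance
    between global CNN descriptors: semantically close pairs keep a looser
    threshold (more local matches retained), distant pairs a stricter one."""
    q = query_feat / (np.linalg.norm(query_feat) + 1e-12)
    c = cand_feat / (np.linalg.norm(cand_feat) + 1e-12)
    semantic_dist = 1.0 - float(q @ c)             # cosine distance between deep features
    return base_threshold - alpha * semantic_dist  # hypothetical linear schedule

def match_kernel(hamming_dists, threshold):
    """Count local descriptor pairs whose Hamming distance falls below the
    (dynamic) threshold; a stand-in for the weighted kernel in the paper."""
    return int((np.asarray(hamming_dists) < threshold).sum())
```

In a Hamming-embedding-style pipeline, `match_kernel` would replace the fixed-threshold test applied to local descriptor pairs assigned to the same visual word, so irrelevant image pairs contribute fewer spurious matches.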
Curriculum CycleGAN for Textual Sentiment Domain Adaptation with Multiple Sources
Sentiment analysis of user-generated reviews or comments on products and services in social networks can help enterprises analyze customer feedback and take corresponding actions for improvement. To mitigate the need for large-scale annotation on the target domain, domain adaptation (DA) provides an alternative solution by learning a transferable model from other labeled source domains. Existing multi-source domain adaptation (MDA) methods either fail to extract some discriminative sentiment-related features in the target domain, neglect the correlations among different sources and the distribution differences among different sub-domains even within the same source, or cannot reflect the varying optimal weighting during different training stages. In this paper, we propose a novel instance-level MDA framework, named curriculum cycle-consistent generative adversarial network (C-CycleGAN), to address the above issues. Specifically, C-CycleGAN consists of three components: (1) a pre-trained text encoder that encodes textual input from different domains into a continuous representation space, (2) an intermediate domain generator with curriculum instance-level adaptation that bridges the gap across source and target domains, and (3) a task classifier trained on the intermediate domain for final sentiment classification. C-CycleGAN transfers source samples at the instance level to an intermediate domain that is closer to the target domain, with sentiment semantics preserved and without losing discriminative features. Further, our dynamic instance-level weighting mechanisms can assign the optimal weights to different source samples in each training stage. We conduct extensive experiments on three benchmark datasets and achieve substantial gains over state-of-the-art DA approaches. Our source code is released at: https://github.com/WArushrush/Curriculum-CycleGAN.
Comment: Accepted by WWW 202
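A compact skeleton of the three components and the curriculum weighting, as they are described in the abstract, is sketched below in PyTorch. The module sizes, the weighting schedule, and the loss mix are assumptions, and the adversarial and cycle-consistency losses of the full C-CycleGAN are omitted for brevity; refer to the released repository for the actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-ins for the three components named in the abstract (sizes assumed).
encoder = nn.Sequential(nn.Linear(768, 256), nn.ReLU())    # pre-trained text encoder (placeholder)
generator = nn.Sequential(nn.Linear(256, 256), nn.Tanh())  # source -> intermediate domain
classifier = nn.Linear(256, 2)                             # sentiment classifier on the intermediate domain
opt = torch.optim.Adam([*encoder.parameters(), *generator.parameters(),
                        *classifier.parameters()], lr=1e-4)

def instance_weights(src_repr, tgt_centroid, stage, num_stages=3):
    # Curriculum idea: early stages up-weight source samples already close to
    # the target domain; later stages flatten the weighting (schedule assumed).
    dist = torch.cdist(src_repr, tgt_centroid.unsqueeze(0)).squeeze(1)
    sharpness = (num_stages - stage) / num_stages
    return torch.softmax(-sharpness * dist, dim=0) * len(dist)

def train_step(src_x, src_y, tgt_x, stage):
    src_z, tgt_z = encoder(src_x), encoder(tgt_x)
    inter = generator(src_z)                                # transferred source instances
    w = instance_weights(inter.detach(), tgt_z.mean(dim=0), stage)
    cls_loss = (w * F.cross_entropy(classifier(inter), src_y, reduction='none')).mean()
    # Crude stand-in for the adversarial domain alignment of the real model.
    align_loss = (inter.mean(dim=0) - tgt_z.mean(dim=0).detach()).pow(2).sum()
    loss = cls_loss + 0.1 * align_loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```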
Retrieving and classifying affective images via deep metric learning
Affective image understanding has been extensively studied in the last decade as more and more users express emotion via visual content. While current algorithms based on convolutional neural networks aim to distinguish emotional categories in a discrete label space, the task is inherently ambiguous. This is mainly because emotional labels with the same polarity (i.e., positive or negative) are highly related, which is different from concrete object concepts such as cat, dog, and bird. To the best of our knowledge, few methods focus on leveraging this characteristic of emotions for affective image understanding. In this work, we address the problem of understanding affective images via deep metric learning and propose a multi-task deep framework that optimizes both retrieval and classification goals. We propose sentiment constraints adapted from triplet constraints, which are able to explore the hierarchical relation of emotion labels. We further exploit the sentiment vector as an effective representation to distinguish affective images, utilizing the texture representation derived from convolutional layers. Extensive evaluations on four widely used affective datasets, i.e., Flickr and Instagram, IAPSa, Art Photo, and Abstract Paintings, demonstrate that the proposed algorithm performs favorably against the state-of-the-art methods on both affective image retrieval and classification tasks.
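The sentiment constraint is described as an adaptation of the triplet constraint that respects the polarity hierarchy of emotion labels. A hedged PyTorch sketch of that idea is given below; the two margins and the exact formulation are assumptions rather than the paper's definition.

```python
import torch
import torch.nn.functional as F

def sentiment_triplet_loss(anchor, positive, negative, same_polarity,
                           margin_intra=0.2, margin_inter=0.5):
    """Triplet-style constraint with an emotion hierarchy: negatives sharing the
    anchor's polarity are pushed away with a smaller margin than negatives of
    the opposite polarity (margin values are assumptions)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    margin = torch.where(same_polarity,
                         torch.full_like(d_pos, margin_intra),
                         torch.full_like(d_pos, margin_inter))
    return F.relu(d_pos - d_neg + margin).mean()
```

Here `same_polarity` would be a boolean tensor marking triplets whose negative shares the anchor's polarity, so that same-polarity emotions end up closer in the learned embedding than opposite-polarity ones, which serves both the retrieval and the classification objectives.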
Computational Emotion Analysis From Images: Recent Advances and Future Directions
Emotions are usually evoked in humans by images. Recently, extensive research efforts have been dedicated to understanding the emotions of images. In this chapter, we aim to introduce image emotion analysis (IEA) from a computational perspective, with a focus on summarizing recent advances and suggesting future directions. We begin with commonly used emotion representation models from psychology. We then define the key computational problems that researchers have been trying to solve and provide supervised frameworks that are generally used for different IEA tasks. After introducing the major challenges in IEA, we present some representative methods on emotion feature extraction, supervised classifier learning, and domain adaptation. Furthermore, we introduce available datasets for evaluation and summarize some main results. Finally, we discuss some open questions and future directions that researchers can pursue.
Comment: Accepted chapter in the book "Human Perception of Visual Information: Psychological and Computational Perspective"